PMSSC: Parallelizable multi-subset based self-expressive model for subspace clustering
Authors
Abstract
Subspace clustering methods which embrace a self-expressive model that represents each data point as a linear combination of other data points in the dataset provide powerful unsupervised learning techniques. However, when dealing with large datasets, representing each data point by referring to all data points via a dictionary suffers from high computational complexity. To alleviate this issue, we introduce a parallelizable multi-subset based self-expressive model (PMS), which represents each data point by combining multiple subsets, each consisting of only a small proportion of the samples. The adoption of PMS in subspace clustering (PMSSC) leads to computational advantages because the optimization problems decomposed over each subset are small, and can be solved efficiently in parallel. Furthermore, PMSSC is able to combine the multiple self-expressive coefficient vectors obtained from the subsets, which contributes to an improvement in self-expressiveness. Extensive experiments on synthetic and real-world datasets show the efficiency and effectiveness of our approach in comparison to other methods.
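The multi-subset idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ridge regularizer, the uniform random choice of subsets, the plain averaging of the per-subset coefficients, and the names `subset_self_expression` and `pms_coefficients` are all assumptions made for illustration. The abstract does not specify the regularizer or the combination rule.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def subset_self_expression(X, idx, lam=1e-2):
    """Represent every column of X using only the columns listed in idx.

    Assumption: a ridge-regularized least-squares problem per point.
    Points that belong to the subset get a zero self-coefficient.
    Returns an (n, n) matrix whose rows outside idx are zero.
    """
    n = X.shape[1]
    m = len(idx)
    D = X[:, idx]                                  # (d, m) sub-dictionary
    P = np.linalg.inv(D.T @ D + lam * np.eye(m))   # small (m, m) inverse
    C_sub = P @ (D.T @ X)                          # unconstrained ridge, (m, n)
    # Closed-form correction that enforces c_kk = 0 for points in the subset:
    # c = c0 - P[:, k] * c0[k] / P[k, k], applied to all subset points at once.
    k = np.arange(m)
    scale = C_sub[k, idx] / np.diag(P)
    C_sub[:, idx] -= P * scale                     # column k of P times scale[k]
    C = np.zeros((n, n))
    C[idx, :] = C_sub
    return C


def pms_coefficients(X, n_subsets=4, subset_size=20, lam=1e-2, seed=0):
    """Average the coefficient matrices from several small random subsets."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    subsets = [rng.choice(n, size=subset_size, replace=False)
               for _ in range(n_subsets)]
    # Each subset problem is independent, so they can be solved in parallel.
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(
            lambda idx: subset_self_expression(X, idx, lam), subsets))
    return sum(parts) / n_subsets


# Toy data (hypothetical): 20 points from each of two orthogonal
# 3-dimensional subspaces of R^30, stored as columns.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((30, 6)))
X = np.hstack([Q[:, :3] @ rng.standard_normal((3, 20)),
               Q[:, 3:] @ rng.standard_normal((3, 20))])

C = pms_coefficients(X)
W = np.abs(C) + np.abs(C).T   # symmetric affinity, e.g. for spectral clustering
```

Each subset problem only inverts a matrix of size `subset_size`, rather than one of size `n`, which is where the computational saving in this sketch comes from. The affinity `W` would then typically be passed to a spectral clustering routine.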
Similar resources
Model-based subspace clustering
We discuss a model-based approach to identifying clusters of objects based on subsets of attributes, so that the attributes that distinguish a cluster from the rest of the population may depend on the cluster being considered. The method is based on a Pólya urn cluster model for multivariate means and variances, resulting in a multivariate Dirichlet process mixture model. This particular model-...
A Self-Training Subspace Clustering
Accurate identification of the cancer types is essential to cancer diagnoses and treatments. Since cancer tissue and normal tissue have different gene expression, gene expression data can be used as an efficient feature source for cancer classification. However, accurate cancer classification directly using original gene expression profiles remains challenging due to the intrinsic high-dimensio...
Genetic algorithms for subset selection in model-based clustering
Model-based clustering assumes that the observed data can be represented by a finite mixture model, where each cluster is represented by a parametric distribution. In the multivariate continuous case the Gaussian distribution is often employed. Identifying the subset of relevant clustering variables allows to achieve parsimony of unknown parameters, thus yielding more efficient estimation, clea...
Constraint-Based Subspace Clustering
In high dimensional data, the general performance of traditional clustering algorithms decreases. This is partly because the similarity criterion used by these algorithms becomes inadequate in high dimensional space. Another reason is that some dimensions are likely to be irrelevant or contain noisy data, thus hiding a possible clustering. To overcome these problems, subspace clustering techniq...
Robust Localized Multi-view Subspace Clustering
In multi-view clustering, different views may have different confidence levels when learning a consensus representation. Existing methods usually address this by assigning distinctive weights to different views. However, due to the noisy nature of real-world applications, the confidence levels of samples in the same view may also vary. Thus considering a unified weight for a view may lead to suboptim...
Journal
Journal title: Computational Visual Media
Year: 2023
ISSN: 2096-0662, 2096-0433
DOI: https://doi.org/10.1007/s41095-022-0293-5